Refactor scale-down for better integration with drainability rules #6135
Conversation
Welcome @artemvmin!
/assign
Force-pushed fe5cf63 to aba1391
Force-pushed c162820 to e53147e
/cc @olagacek
@x13n: GitHub didn't allow me to request PR reviews from the following users: olagacek. Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this: /cc @olagacek
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@@ -106,39 +87,6 @@ func GetPodsToMove(nodeInfo *schedulerframework.NodeInfo, deleteOptions NodeDele
	if err != nil {
		return pods, daemonSetPods, blockingPod, err
	}
	if pdbBlockingPod, err := checkPdbs(pods, pdbs); err != nil {
By moving this check before GetPodsForDeletionOnNodeDrain, you're changing the logic - it will now operate on a different set of pods. In particular, it will start to check PDBs for DS pods, which I think doesn't make sense - we don't want to block node removal on this. I think rewriting GetPodsForDeletionOnNodeDrain into drainability rules first (as mentioned in the TODO above) would be a safer approach, since then you could preserve the ordering.
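To make the ordering concern concrete, here is a purely illustrative Go sketch (an editorial addition, not autoscaler code; every type and function name in it is invented). When DaemonSet pods are filtered out before the PDB check, a PDB covering only a DS pod never blocks removal; running the check on the unfiltered list makes it block.

```go
package main

import "fmt"

// pod and pdb are deliberately minimal stand-ins; the real code uses the
// Kubernetes API types and label selectors.
type pod struct {
	name        string
	isDaemonSet bool
	app         string
}

type pdb struct {
	app string // blocks eviction of pods with this app label
}

// firstBlockedPod returns the first pod covered by any PDB, mimicking the
// "blocking pod" result of a PDB check.
func firstBlockedPod(pods []pod, pdbs []pdb) *pod {
	for i := range pods {
		for _, b := range pdbs {
			if pods[i].app == b.app {
				return &pods[i]
			}
		}
	}
	return nil
}

func main() {
	all := []pod{
		{name: "log-agent", isDaemonSet: true, app: "logger"},
		{name: "web", isDaemonSet: false, app: "web"},
	}
	pdbs := []pdb{{app: "logger"}}

	// Original order: DaemonSet pods are dropped before the PDB check, so the
	// logger PDB never blocks node removal.
	var drainable []pod
	for _, p := range all {
		if !p.isDaemonSet {
			drainable = append(drainable, p)
		}
	}
	fmt.Println("PDB check after DS filtering:", firstBlockedPod(drainable, pdbs)) // <nil>

	// Reordered check: the DaemonSet pod is still in the list, so its PDB now
	// blocks removal - the behavior change described above.
	fmt.Println("PDB check before DS filtering:", firstBlockedPod(all, pdbs)) // &{log-agent true logger}
}
```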
I became aware of this ordering issue early on and decided to turn this PR into the full story. I reordered the commits and updated the title. I'll ping for a follow-up review when the remainder is implemented.
	// require adding information to the DrainContext, such as the slice of pods
	// and a flag to prevent duplicate checks.
	for _, pdb := range drainCtx.Pdbs {
		selector, err := metav1.LabelSelectorAsSelector(pdb.Spec.Selector)
With this implementation we're doing the same conversion O(N*M) times instead of O(N) as before. (Where N is the number of PDBs and M is the number of pods.) Could we keep selectors, rather than just raw pdbs, in the context? Or - even better - reuse RemainingPdbTracker, which already operates in that way?
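For illustration only, a sketch of the first suggestion (precomputing each selector once per PDB) using the standard apimachinery helpers. The precomputedPdb type and both functions are invented for this sketch and are not part of the PR; the PR ultimately reuses RemainingPdbTracker instead.

```go
package pdbcheck

import (
	apiv1 "k8s.io/api/core/v1"
	policyv1 "k8s.io/api/policy/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/labels"
)

// precomputedPdb pairs a PDB with its parsed selector so the conversion runs
// once per PDB (O(N)) instead of once per pod-PDB pair (O(N*M)).
type precomputedPdb struct {
	pdb      *policyv1.PodDisruptionBudget
	selector labels.Selector
}

// precompute converts each PDB's label selector exactly once.
func precompute(pdbs []*policyv1.PodDisruptionBudget) ([]precomputedPdb, error) {
	out := make([]precomputedPdb, 0, len(pdbs))
	for _, pdb := range pdbs {
		sel, err := metav1.LabelSelectorAsSelector(pdb.Spec.Selector)
		if err != nil {
			return nil, err
		}
		out = append(out, precomputedPdb{pdb: pdb, selector: sel})
	}
	return out, nil
}

// matchingPdbs reuses the parsed selectors for every pod it is asked about.
func matchingPdbs(pod *apiv1.Pod, pdbs []precomputedPdb) []*policyv1.PodDisruptionBudget {
	var matched []*policyv1.PodDisruptionBudget
	for _, p := range pdbs {
		if p.pdb.Namespace == pod.Namespace && p.selector.Matches(labels.Set(pod.Labels)) {
			matched = append(matched, p.pdb)
		}
	}
	return matched
}
```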
I had a separate draft passing RemainingPdbTracker directly, but without using DrainCtx. The combination of these two ideas seems to be the sweet spot. I initially didn't follow through with it because I got scared of the asynchronous goroutines using the NodesToRemove function (https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/core/scaledown/actuation/actuator.go#L219) and the lack of multi-thread safety in the RemainingPdbTracker object. Do you think this is an issue?
I added the refactor to this PR.
Ah, good question. As far as I can tell though, you're using a new instance of the tracker in each such goroutine. This is perhaps suboptimal, but should be safe.
...r-autoscaler/processors/scaledowncandidates/emptycandidates/empty_candidates_sorting_test.go (outdated review comment, resolved)
Force-pushed 84ad488 to 98effe0
The current refactor looks good to me now. Are you sure you want to rewrite all the checks in one humongous PR? You may start getting into merge conflicts, so I'd suggest following up in a separate PR - WDYT? Btw, in the release notes you're adding some actions required - while this is true for downstream/forked code, it may be confusing for OSS CA release notes (which is the intended audience for these). I don't think this change should have any user-visible changes.
Sounds good. Thanks for the tip. I updated the PR title.
That makes sense. Updated. Please review.
Force-pushed 98effe0 to 8f2532a
Force-pushed 8f2532a to 5a2f46e
@@ -106,7 +90,7 @@ func GetPodsToMove(nodeInfo *schedulerframework.NodeInfo, deleteOptions NodeDele
 	if err != nil {
 		return pods, daemonSetPods, blockingPod, err
 	}
-	if pdbBlockingPod, err := checkPdbs(pods, pdbs); err != nil {
+	if pdbBlockingPod, err := checkPdbs(pods, remainingPdbTracker.GetPdbs()); err != nil {
It seems you could already delete the checkPdbs function now and just use RemainingPdbTracker.CanRemovePods().
Done. I wasn't sure what the "legacy scale-down" was referring to in the comment below, so I left it alone. The logic looks identical if the parallel return value is ignored.
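As an editorial illustration, a rough sketch of that swap. The three-value CanRemovePods signature (a can-remove flag, a parallel flag that is ignored here, and the blocking pod) is assumed from the discussion above; the local pdbTracker interface and checkPdbsViaTracker helper are invented for this sketch, and the real interface lives in the scale-down pdb package.

```go
package simulator

import (
	"fmt"

	apiv1 "k8s.io/api/core/v1"
	"k8s.io/autoscaler/cluster-autoscaler/utils/drain"
)

// pdbTracker captures only the method used here; the signature is an
// assumption based on the review thread, not a copy of the real interface.
type pdbTracker interface {
	CanRemovePods(pods []*apiv1.Pod) (canRemove, inParallel bool, blockingPod *drain.BlockingPod)
}

// checkPdbsViaTracker mirrors the old checkPdbs contract (a blocking pod plus
// an error) while delegating the PDB accounting to the tracker. The
// "in parallel" return value is deliberately ignored, as noted above.
func checkPdbsViaTracker(tracker pdbTracker, pods []*apiv1.Pod) (*drain.BlockingPod, error) {
	canRemove, _, blockingPod := tracker.CanRemovePods(pods)
	if !canRemove {
		return blockingPod, fmt.Errorf("not enough pod disruption budget to remove pods")
	}
	return nil, nil
}
```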
/lgtm
I only really have one minor comment, but it doesn't block this PR, so feel free to cancel the hold if you don't want to address it now.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: artemvmin, x13n. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
Comments addressed. /unhold
Thanks! /lgtm
Force-pushed 5892f72 to 9ea5a36
/lgtm
What type of PR is this?
/kind feature
What this PR does / why we need it:
This is one of several CLs to move all drain conditions to drainability rules. Once complete, clients implementing custom drainability rules will have full control over the scale-down of nodes.
Notable changes:
- simulation/drainability: The Rule.Drainable() function now takes a DrainContext. This function can assume that DrainContext is not nil. (An illustrative sketch of this shape follows at the end of this description.)
- simulation: NodeDeleteOptions has been split into two structs: simulation/options:NodeDeleteOptions and drainability/rules:Rules. Consumers of this struct have been updated accordingly.
Does this PR introduce a user-facing change?
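To illustrate the shape described in the first bullet, here is a hedged, illustrative-only sketch of a custom rule receiving a DrainContext. The type names are simplified stand-ins rather than the actual cluster-autoscaler/simulator/drainability definitions, and the local-storage rule is a made-up example.

```go
package example

import (
	"time"

	apiv1 "k8s.io/api/core/v1"
	policyv1 "k8s.io/api/policy/v1"
)

// drainContext is a stand-in for the DrainContext handed to every rule; per
// the description above, rules may assume it is never nil.
type drainContext struct {
	Pdbs      []*policyv1.PodDisruptionBudget
	Timestamp time.Time
}

// rule is a stand-in for the drainability Rule interface: given the drain
// context and a pod, it decides whether that pod may be drained.
type rule interface {
	Drainable(drainCtx *drainContext, pod *apiv1.Pod) bool
}

// blockLocalStorage is a toy custom rule: pods mounting emptyDir volumes are
// treated as not drainable, so their node is not scaled down.
type blockLocalStorage struct{}

var _ rule = blockLocalStorage{}

func (blockLocalStorage) Drainable(drainCtx *drainContext, pod *apiv1.Pod) bool {
	for _, vol := range pod.Spec.Volumes {
		if vol.EmptyDir != nil {
			return false
		}
	}
	return true
}
```

A custom rule of this kind would be plugged in alongside the defaults via the drainability/rules:Rules list mentioned in the second bullet, which is what gives clients full control over scale-down decisions.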